Abstract

The Schrödinger equation can be derived using the minimum Fisher information
principle. I discuss why such an approach should work, and also show
that the Kähler and Hilbert space structures of quantum mechanics result
from combining the symplectic structure of the hydrodynamical model with
the Fisher information metric.
I. INTRODUCTION

In a previous paper [1], it was shown that the hydrodynamical formulation of the
Schrödinger equation can be derived using an information-theoretical approach that is based
on the principle of minimum Fisher information. A derivation along similar lines is also
possible for other non-relativistic quantum mechanical equations, such as the Pauli equation
and the equation for the quantum rotator. The purpose of this paper is twofold:
to examine why such an information-theoretical approach should work, and to show that
the Kähler and Hilbert space structures of quantum mechanics result from combining the
symplectic structure of the hydrodynamical model with the Fisher information metric of
information theory. The complex transformation of the hydrodynamical variables that
puts this Kähler metric in its canonical form is the one that leads to the usual Schrödinger
representation.

Frieden was the first to point out a connection between the principle of minimum
Fisher information and the Schrödinger equation. Frieden and coworkers later developed
and extended this work in a series of papers which made use of a new principle called the
extreme physical information (EPI) principle. In this paper I will not discuss the EPI
principle, which differs from the principle of minimum Fisher information in many ways (for
a review of the EPI approach, see the book by Frieden), but will concentrate instead
on the information-theoretical approach used in [1]. In this approach, the emphasis is
on using the principle of minimum Fisher information to complement a physical picture
derived from a hydrodynamical model. Applying the principle under the assumption that
one can describe the motion of particles in terms of a hydrodynamical model leads directly
to Madelung's hydrodynamical formulation of quantum mechanics.



II. CROSS-ENTROPY AND FISHER INFORMATION 

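Let $P(y^i)$ be a probability density which is a function of $n$ continuous coordinates $y^i$,
and let $P(y^i + \Delta y^i)$ be the density that results from a small change in the $y^i$. Expand
$P(y^i + \Delta y^i)$ in a Taylor series, and calculate the cross-entropy $J$ up to the first
non-vanishing term,

$$
J\big(P(y^i + \Delta y^i) : P(y^i)\big) = \int P(y^i + \Delta y^i) \, \ln \frac{P(y^i + \Delta y^i)}{P(y^i)} \, d^n y \tag{1}
$$

$$
\simeq \left[ \frac{1}{2} \int \frac{1}{P(y^i)} \frac{\partial P(y^i)}{\partial y^j} \frac{\partial P(y^i)}{\partial y^k} \, d^n y \right] \Delta y^j \Delta y^k
= I_{jk} \, \Delta y^j \Delta y^k .
$$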



The $I_{jk}$ are the elements of the Fisher information matrix. This is not the most general
expression for the Fisher information matrix, but the particular case that is of interest here.
The general expression is of the form

$$
I_{jk}(\theta^i) = \frac{1}{2} \int \frac{1}{P(x^i|\theta^i)} \frac{\partial P(x^i|\theta^i)}{\partial \theta^j} \frac{\partial P(x^i|\theta^i)}{\partial \theta^k} \, d^n x \tag{2}
$$

where $P(x^i|\theta^i)$ is a probability density that depends on a set of $n$ parameters $\theta^i$ in addition
to the $n$ coordinates $x^i$. The expression for the $I_{jk}$ that appears in equation (1) can be
derived from the general formula if

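$$
P(x^i|\theta^i) = P(x^i + \theta^i).
$$

To see this, introduce a new set of variables $y^i = x^i + \theta^i$. Then

$$
\frac{1}{2} \int \frac{1}{P(x^i + \theta^i)} \frac{\partial P(x^i + \theta^i)}{\partial \theta^j} \frac{\partial P(x^i + \theta^i)}{\partial \theta^k} \, d^n x
= \frac{1}{2} \int \frac{1}{P(y^i)} \frac{\partial P(y^i)}{\partial y^j} \frac{\partial P(y^i)}{\partial y^k} \, d^n y
$$

since $d^n x \to d^n y$ as the integration over the $x^i$ coordinates is for fixed values of $\theta^i$.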
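The relation between the cross-entropy of a shifted density and the Fisher information can be checked numerically. The following is only a minimal sketch under stated assumptions: one dimension, a Gaussian density of width `sigma` (chosen for illustration, not taken from the text), and simple trapezoidal integration. For this case the cross-entropy of the shifted density should agree with $I \, (\Delta y)^2$, where $I = \frac{1}{2}\int (1/P)(dP/dy)^2 \, dy = 1/(2\sigma^2)$.

```python
import math

def gaussian(y, sigma=1.0):
    """Normalized 1D Gaussian density with zero mean (illustrative choice of P)."""
    return math.exp(-y * y / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(f, a=-12.0, b=12.0, n=24000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def cross_entropy_shift(delta, sigma=1.0):
    """J(P(y + delta) : P(y)) = int P(y + delta) ln[P(y + delta) / P(y)] dy."""
    def integrand(y):
        p_shift = gaussian(y + delta, sigma)
        p = gaussian(y, sigma)
        return p_shift * math.log(p_shift / p)
    return integrate(integrand)

def fisher_information(sigma=1.0, h=1e-4):
    """I = (1/2) int (1/P) (dP/dy)^2 dy, using a central-difference derivative."""
    def integrand(y):
        dp = (gaussian(y + h, sigma) - gaussian(y - h, sigma)) / (2.0 * h)
        return 0.5 * dp * dp / gaussian(y, sigma)
    return integrate(integrand)

if __name__ == "__main__":
    delta = 0.05
    print("cross-entropy      :", cross_entropy_shift(delta))
    print("I * delta^2        :", fisher_information() * delta ** 2)
```

For a Gaussian the quadratic term is the whole answer, so the two printed numbers should agree to within the integration error; for a general density the agreement holds only to second order in the shift.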

If $P$ is defined over an $n$-dimensional manifold $M$ with (positive) inverse metric $g^{ik}$, there
is a natural definition of the amount of information $I$ associated with $P$, which is obtained
by contracting $g^{ik}$ with the elements of the Fisher information matrix,

$$
I = g^{ik} I_{ik} = \frac{g^{ik}}{2} \int \frac{1}{P} \frac{\partial P}{\partial y^i} \frac{\partial P}{\partial y^k} \, d^n y . \tag{3}
$$

The case of interest here is the one where $M$ is the $n + 1$ dimensional extended configuration
space $QT$ (with coordinates $\{t, x^1, \ldots, x^n\}$) of a non-relativistic particle of mass $m$. Then,
the inverse metric is the one used to define the kinematical line element in configuration
space, which is of the form $g^{ik} = \mathrm{diag}(0, 1/m, \ldots, 1/m)$. Sometimes it will be convenient to
use quantities defined over the configuration space $Q$ (with coordinates $\{x^1, \ldots, x^n\}$) rather
than $QT$, and I will do so if it simplifies the notation.

III. DERIVATION OF THE SCHRÖDINGER EQUATION

In the Hamilton-Jacobi formulation of classical mechanics, the equation of motion takes
the form

$$
\frac{\partial S}{\partial t} + \frac{1}{2} g^{\mu\nu} \frac{\partial S}{\partial x^\mu} \frac{\partial S}{\partial x^\nu} + V = 0 \tag{4}
$$

where $g^{\mu\nu} = \mathrm{diag}(1/m, \ldots, 1/m)$ is the inverse metric used to define the kinematical line
element in the configuration space $Q$ parametrized by coordinates $x^\mu$. The velocity field
$u^\mu$ is derived from $S$ according to

$$
u^\mu = g^{\mu\nu} \frac{\partial S}{\partial x^\nu} . \tag{5}
$$

When the exact coordinates that describe the state of the classical system are unknown, one
usually describes the system by means of a probability density $P(t, x^\mu)$. The probability
density must satisfy the following two conditions: it must be normalized,

$$
\int P \, d^n x = 1,
$$

and it must satisfy a continuity equation,

$$
\frac{\partial P}{\partial t} + \frac{\partial}{\partial x^\mu} \left( P \, g^{\mu\nu} \frac{\partial S}{\partial x^\nu} \right) = 0 . \tag{6}
$$

Equations (4) and (6), together with (5), completely determine the motion of the classical
ensemble. Equations (4) and (6) can be derived from the Lagrangian

$$
L_{CL} = \int P \left\{ \frac{\partial S}{\partial t} + \frac{1}{2} g^{\mu\nu} \frac{\partial S}{\partial x^\mu} \frac{\partial S}{\partial x^\nu} + V \right\} dt \, d^n x \tag{7}
$$

by fixed end-point variation ($\delta P = \delta S = 0$ at the boundaries) with respect to $S$ and $P$.

Quantization of the classical ensemble is achieved by adding to the classical Lagrangian
(7) a term proportional to the information $I$ defined by equation (3) [1]. This leads to the
Lagrangian for the Schrödinger equation,

$$
L_{QM} = L_{CL} + \lambda I
= \int P \left\{ \frac{\partial S}{\partial t} + \frac{1}{2} g^{\mu\nu} \left[ \frac{\partial S}{\partial x^\mu} \frac{\partial S}{\partial x^\nu} + \lambda \frac{1}{P^2} \frac{\partial P}{\partial x^\mu} \frac{\partial P}{\partial x^\nu} \right] + V \right\} dt \, d^n x . \tag{8}
$$
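Fixed end-point variation with respect to $S$ leads again to (6), while fixed end-point variation
with respect to $P$ leads to

$$
\frac{\partial S}{\partial t} + \frac{1}{2} g^{\mu\nu} \left[ \frac{\partial S}{\partial x^\mu} \frac{\partial S}{\partial x^\nu} + \lambda \left( \frac{1}{P^2} \frac{\partial P}{\partial x^\mu} \frac{\partial P}{\partial x^\nu} - \frac{2}{P} \frac{\partial^2 P}{\partial x^\mu \partial x^\nu} \right) \right] + V = 0 . \tag{9}
$$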
Equations (6) and (9) are identical to the Schrödinger equation provided the wave function
$\psi(t, x^\mu)$ is written in terms of $S$ and $P$ by

$$
\psi = \sqrt{P} \, \exp(iS/\hbar)
$$

and the parameter $\lambda$ is set equal to

$$
\lambda = \left( \frac{\hbar}{2} \right)^2 .
$$

Note that the classical limit of the Schrödinger theory is not the Hamilton-Jacobi equation
for a classical particle, but the equations (4) and (6) which describe a classical ensemble.
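The identification of the Fisher information term with the usual quantum potential can be checked numerically. The sketch below is illustrative only and rests on assumptions that are not in the text: one spatial dimension, units with $\hbar = m = 1$, a Gaussian density of width `sigma`, and finite-difference derivatives. It compares the term $(\lambda/2m)\,[(P'/P)^2 - 2P''/P]$ with $\lambda = (\hbar/2)^2$ against the standard expression $-(\hbar^2/2m)\,(\sqrt{P})''/\sqrt{P}$.

```python
import math

HBAR = 1.0                     # assumed units, for illustration only
MASS = 1.0
LAMBDA = (HBAR / 2.0) ** 2     # value of the parameter lambda used in the text

def density(x, sigma=0.8):
    """Illustrative probability density: zero-mean Gaussian of width sigma."""
    return math.exp(-x * x / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

def d1(f, x, h=1e-3):
    """Central-difference first derivative."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def d2(f, x, h=1e-3):
    """Central-difference second derivative."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def fisher_term(x):
    """(lambda / 2m) [ (P'/P)^2 - 2 P''/P ], the extra term in the modified Hamilton-Jacobi equation."""
    p = density(x)
    return LAMBDA / (2.0 * MASS) * ((d1(density, x) / p) ** 2 - 2.0 * d2(density, x) / p)

def quantum_potential(x):
    """-(hbar^2 / 2m) (sqrt P)'' / sqrt P, the quantum potential of the Madelung equations."""
    amplitude = lambda y: math.sqrt(density(y))
    return -HBAR ** 2 / (2.0 * MASS) * d2(amplitude, x) / amplitude(x)

if __name__ == "__main__":
    for x in (-1.0, 0.0, 0.5, 1.3):
        print(x, fisher_term(x), quantum_potential(x))
```

The two columns should coincide up to discretization error, which is what one expects if the Fisher term with this value of $\lambda$ reproduces the quantum potential.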



It can be shown (see Appendix) that the Fisher information I increases when P is varied 
while S is kept fixed. Therefore, the solution derived here is the one that minimizes the 
Fisher information for a given S. 

The approach followed here is of interest in that it provides a way of distinguishing
between physical and information-theoretical assumptions (for a very clear account of the
importance of making this type of distinction in quantum mechanics see the paper by Jaynes).
In general terms, the information-theoretical content of the theory lies in the prescription
to minimize the Fisher information associated with the probability distribution that
describes the position of particles, while the physical content of the theory is contained in
the assumption that one can describe the motion of particles in terms of a hydrodynamical
model.

IV. ON THE USE OF THE MINIMUM FISHER INFORMATION PRINCIPLE IN QUANTUM MECHANICS

The cross-entropy $J$,

$$
J(Q : P) = \int Q(y^i) \, \ln \left( \frac{Q(y^i)}{P(y^i)} \right) d^n y ,
$$

where $P$, $Q$ are two probability densities, plays a central role in information theory and in
the theory of inference. It has properties that are desirable for an information measure,
and it can be argued that it measures the amount of information needed to change a
prior probability density $P$ into the posterior $Q$ [10]. Maximization of the relative entropy
(which is defined as the negative of the cross-entropy; see footnote 1) is the basis of the
maximum entropy principle, a method for inductive inference that leads to a posterior
distribution given a prior distribution and new information in the form of expected values.
The maximum entropy principle asserts that of all the probability densities that are
consistent with the new information, the one which has the maximum relative entropy is the
one that provides the most unbiased representation of our knowledge of the state of the
system. There are several approaches that lead to the maximum entropy principle. In the
original derivation by Jaynes [11], the use of the maximum entropy principle was justified on
the basis of the relative entropy's unique properties as an uncertainty measure. An
independent justification based on consistency arguments was later given by Shore and
Johnson [12]. Jaynes had already remarked that inferences made using any other information
measure than the entropy may lead to contradictions. Shore and Johnson considered the
consequences of requiring that methods of inference be self-consistent. They introduced a
set of axioms that were all based on one fundamental principle: if a problem can be solved
in more than one way, the results should be consistent. They showed that given information
in the form of a set of constraints on expected values, there is only one distribution
satisfying the set of constraints which can be chosen using a procedure that satisfies their
axioms, and this unique distribution can be obtained by maximizing the relative entropy.
Therefore, they concluded that if a method of inference is based on a variational principle,
maximizing any function but the relative entropy will lead to inconsistencies unless that
function and the relative entropy have identical maxima (any monotonic function of the
relative entropy will work, for example).

Footnote 1. A note on terminology: due to the connection between relative entropy and
cross-entropy, the maximum entropy principle is also known as the minimum cross-entropy
principle, which can lead to some confusion. The cross-entropy (or its negative) may be found
in the literature under various names: Kullback-Leibler information, directed divergence,
discrimination information, Rényi's information gain, expected weight of evidence, entropy,
entropy distance.

It is tempting to argue by analogy that the minimum Fisher information derivation of
the Schrödinger equation is in essence nothing but a variation on maximum entropy, one
in which maximization of relative entropy is simply replaced by minimization of the Fisher
information (some similarities and differences of the two approaches were discussed briefly
in [1]). But if we take into consideration the unique properties that make cross-entropy
the fundamental measure of information together with the result of Shore and Johnson, it 
becomes difficult to justify a principle of inference based on information theory that would 
operate along the same lines as maximum entropy but using the principle of minimum 
Fisher information instead. To understand the use of the minimum Fisher information 
principle in the context of quantum mechanics, it is crucial to take into consideration that 
here one is selecting those probability distributions $P(y^i)$ for which a perturbation that
leads to $P(y^i + \Delta y^i)$ will result in the smallest increase of the cross-entropy for a given
$S(y^i)$. In other words, the method of choosing $P(y^i)$ is based on the idea that a solution
should be stable under perturbations in the very precise sense that the amount of additional
information needed to describe the change in the solution should be as small as possible.
We have then a new principle: choose the probability densities that describe the quantum
system on the basis of the stability of those solutions, where the measure of the stability is
given by the amount of information needed to change $P(y^i)$ into $P(y^i + \Delta y^i)$. Why should
restricting the choice of {P, S} to those that are stable in this sense lead to the excellent 
predictions of quantum mechanics? Such an approach should work for physical systems that 
can be represented by models in which the probability density P describes the equilibrium 
density of an underlying stochastic process (see for example the derivation of the diffusion 
equation using the minimum Fisher information principle in [13]). Such models of quantum
mechanics do exist: a formulation along these lines was first proposed by Bohm and Vigier
[14], and later a different but related formulation was given by Nelson [15] (for a review of the
stochastic formulation of the quantum theory that compares these two approaches, see [16]).
Whether the additional assumptions needed to build these particular models are sound, and 
whether they provide a correct description of quantum mechanics will depend of course on 
the experimental predictions that they make. The minimum Fisher information approach 
can be of no help here, since it is only concerned with making inferences about probability 
distributions and operates therefore at the epistemological level. 
V. KÄHLER AND HILBERT SPACE STRUCTURES OF QUANTUM MECHANICS

I now want to examine the assumptions that are needed to construct the Kähler and
Hilbert space structures of quantum mechanics. My aim is not to give a mathematically
rigorous derivation of these results, but to give arguments that justify introducing the Kähler
space structure on the basis of mathematical structures that arise naturally in the
hydrodynamical model and in information theory. In particular, I want to show that the Kähler
structure of quantum mechanics results from combining the symplectic structure of the
hydrodynamical model with the Fisher information metric of information theory. The complex
transformation of the hydrodynamical variables that puts this Kähler metric in its canonical
form is the one that leads to the usual Schrödinger representation. Good descriptions of
the geometrical formulation of quantum mechanics covering the case of infinite-dimensional
Kähler manifolds are available in the literature; see for example Cirelli et al. [17], Ashtekar
and Schilling [18] and Brody and Hughston [19]. The approach of Brody and Hughston is
of special interest in that they make explicit use of the Fisher information metric, although
without making reference to the hydrodynamical formulation.

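I first look at the symplectic structure of the hydrodynamical formulation. Introduce
as basic variables the hydrodynamical fields $\{P, S\}$. The symplectic structure is given by
the two-form

$$
\Omega\big(\delta P(x^\mu), \delta S(x^\mu); \delta' P(x^\mu), \delta' S(x^\mu)\big)
= \int \Big\{ \big(\delta P(x^\mu), \delta S(x^\mu)\big) \cdot \Omega \cdot \big(\delta' P(x^\mu), \delta' S(x^\mu)\big)^{T} \Big\} \, d^n x
$$

$$
= \int \big(\delta P(x^\mu), \delta S(x^\mu)\big)
\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}
\begin{pmatrix} \delta' P(x^\mu) \\ \delta' S(x^\mu) \end{pmatrix} d^n x
$$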

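where $\delta$ and $\delta'$ are two generic systems of increments for the phase-space variables. The
Poisson brackets for two functions $F_1(P, S)$, $F_2(P, S)$ take the form

$$
\{F_1(P, S), F_2(P, S)\} = \int \left\{ \frac{\delta F_1}{\delta P} \frac{\delta F_2}{\delta S} - \frac{\delta F_1}{\delta S} \frac{\delta F_2}{\delta P} \right\} d^n x .
$$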



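The equations of motion (6), (9) can be written as

$$
\frac{\partial P}{\partial t} = \{P, \mathcal{H}\} = \frac{\delta \mathcal{H}}{\delta S},
\qquad
\frac{\partial S}{\partial t} = \{S, \mathcal{H}\} = -\frac{\delta \mathcal{H}}{\delta P}
$$

with the Hamiltonian $\mathcal{H}$ given by

$$
\mathcal{H} = \int P \left\{ \frac{1}{2} g^{\mu\nu} \left[ \frac{\partial S}{\partial x^\mu} \frac{\partial S}{\partial x^\nu} + \frac{\hbar^2}{4} \frac{1}{P^2} \frac{\partial P}{\partial x^\mu} \frac{\partial P}{\partial x^\nu} \right] + V \right\} d^n x .
$$

$\mathcal{H}$ acts as the generator of time translations.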

To introduce the Fisher information metric, let $\theta^\mu$ be a set of real continuous parameters,
and consider the parametric family of positive distributions defined by

$$
P(x^\mu|\theta^\mu) = P(x^\mu + \theta^\mu)
$$

where the probability densities $P$ are solutions of the Schrödinger equation (at time $t = 0$).
Then there is a natural metric over the space of parameters $\theta^\mu$ given by the Fisher
information matrix [20], and it leads to a concept of distance defined by

$$
ds^2(\theta^\mu) = \frac{1}{2} \int \frac{1}{P(x^\mu|\theta^\mu)} \frac{\partial P(x^\mu|\theta^\mu)}{\partial \theta^\mu} \frac{\partial P(x^\mu|\theta^\mu)}{\partial \theta^\nu} \, d^n x \; \delta\theta^\mu \delta\theta^\nu . \tag{10}
$$



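Using

$$
\delta P = \frac{\partial P}{\partial \theta^\mu} \delta\theta^\mu
$$

one can write equation (10) as

$$
ds^2(\theta^\mu) = \frac{1}{2} \int \frac{1}{P(x^\mu|\theta^\mu)} \, \delta P(x^\mu|\theta^\mu) \, \delta P(x^\mu|\theta^\mu) \, d^n x . \tag{11}
$$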

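We use equation (11) to introduce a metric over the space of solutions of the Schrödinger
equation (i.e., $P(x^\mu|\theta^\mu)$ with $\theta^\mu = 0$) by setting

$$
ds^2(\delta P, \delta' P) = \frac{1}{2} \int \frac{1}{P(x^\mu)} \, \delta P(x^\mu) \, \delta' P(x^\mu) \, d^n x
= \int g^{(P)} \, \delta P(x^\mu) \, \delta' P(x^\mu) \, d^n x
$$

where

$$
P(x^\mu) = P(x^\mu|\theta^\mu = 0), \qquad
\delta P(x^\mu) = \delta P(x^\mu|\theta^\mu)\big|_{\theta^\mu = 0}, \qquad
g^{(P)} = \frac{1}{2P(x^\mu)} .
$$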

I now want to extend the metric over the probability densities to a metric $g_{ab}$ over the
whole space $\{P, S\}$ of solutions of the Schrödinger equation, in such a way that the metric
structure is compatible with the symplectic structure. To do this, introduce a complex
structure $J^a{}_b$ and impose the following conditions,

$$
\Omega_{ab} = g_{ac} J^c{}_b , \tag{12}
$$

$$
J^a{}_c \, g_{ab} \, J^b{}_d = g_{cd} , \tag{13}
$$

$$
J^a{}_b J^b{}_c = -\delta^a{}_c . \tag{14}
$$



A set of $\{\Omega_{ab}, g_{ab}, J^a{}_b\}$ that satisfy equations (12), (13) and (14) defines a Kähler structure.
Equation (12) is a compatibility equation between $\Omega_{ab}$ and $g_{ab}$, equation (13) is the condition
that the metric should be Hermitian, and equation (14) is the condition that $J^a{}_b$ should be
a complex structure. Let

$$
\Omega_{ab} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}
$$

and require that $g_{ab}$ be a real, symmetric matrix of the form

$$
g_{ab} = \begin{pmatrix} \hbar g^{(P)} & g_{PS} \\ g_{SP} & g_{SS} \end{pmatrix} .
$$



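Then the solutions $g_{ab}$ and $J^a{}_b$ to equations (12), (13) and (14) depend on an arbitrary real
function $A$ and are of the form

$$
g_{ab}(A) = \begin{pmatrix} \hbar g^{(P)} & A \\ A & (\hbar g^{(P)})^{-1} (1 + A^2) \end{pmatrix} ,
\qquad
J^a{}_b(A) = \begin{pmatrix} A & (\hbar g^{(P)})^{-1} (1 + A^2) \\ -\hbar g^{(P)} & -A \end{pmatrix} .
$$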
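The three Kähler conditions can be verified pointwise for this one-parameter family with ordinary 2x2 matrix algebra. The sketch below is a check under stated assumptions rather than part of the derivation: `c` stands for the positive number $\hbar g^{(P)}$ at a given point, `A` is the arbitrary real function evaluated at that point, the matrices are written in the $(P, S)$ ordering, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(a):
    """Transpose of a 2x2 matrix."""
    return [[a[j][i] for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-9):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def kahler_triple(c, A):
    """Return (Omega, g, J) in the (P, S) ordering; c plays the role of hbar * g^(P) > 0."""
    omega = [[0.0, 1.0], [-1.0, 0.0]]
    g = [[c, A], [A, (1.0 + A * A) / c]]
    J = [[A, (1.0 + A * A) / c], [-c, -A]]
    return omega, g, J

def check_kahler(c, A):
    """Evaluate the compatibility, Hermiticity and complex-structure conditions."""
    omega, g, J = kahler_triple(c, A)
    minus_identity = [[-1.0, 0.0], [0.0, -1.0]]
    return {
        "compatibility: Omega = g J": close(matmul(g, J), omega),
        "hermitian: J^T g J = g": close(matmul(transpose(J), matmul(g, J)), g),
        "complex structure: J J = -1": close(matmul(J, J), minus_identity),
    }

if __name__ == "__main__":
    for name, ok in check_kahler(c=0.37, A=2.5).items():
        print(name, ok)
```

All three conditions should hold for any real `A` and any positive `c`, which is the sense in which the solution depends on an arbitrary real function.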

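The choice of $A$ that leads to the simplest Kähler structure is $A = 0$, which is a unique
choice in that it leads to the flat Kähler metric. I will show this by carrying out the
complex transformation that leads to the canonical form for the flat Kähler metric. I set
$A = 0$, and work with the Kähler structure given by

$$
\Omega_{ab} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} , \tag{15}
$$

$$
g_{ab} = \begin{pmatrix} \hbar g^{(P)} & 0 \\ 0 & (\hbar g^{(P)})^{-1} \end{pmatrix} , \tag{16}
$$

$$
J^a{}_b = \begin{pmatrix} 0 & (\hbar g^{(P)})^{-1} \\ -\hbar g^{(P)} & 0 \end{pmatrix} . \tag{17}
$$

The complex coordinate transformation is nothing but the Madelung transformation

$$
\psi = \sqrt{P} \, \exp(iS/\hbar) , \qquad
\psi^* = \sqrt{P} \, \exp(-iS/\hbar) .
$$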



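In terms of the new variables, (15), (16) and (17) take the canonical form

$$
\Omega_{ab} = \begin{pmatrix} 0 & i\hbar \\ -i\hbar & 0 \end{pmatrix} , \qquad
g_{ab} = \begin{pmatrix} 0 & \hbar \\ \hbar & 0 \end{pmatrix} , \qquad
J^a{}_b = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix} .
$$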



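The Madelung transformation is remarkable in that the Hamiltonian takes the very simple
form

$$
\mathcal{H} = \int \left\{ \frac{\hbar^2}{2} g^{\mu\nu} \frac{\partial \psi^*}{\partial x^\mu} \frac{\partial \psi}{\partial x^\nu} + V \psi^* \psi \right\} d^n x
$$

and the equations of motion become linear.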
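The equality of the Hamiltonian in hydrodynamical and wave-function variables can be illustrated numerically. The sketch below makes several assumptions that are not in the text: one dimension, units with $\hbar = m = 1$, a Gaussian density, a linear phase `S(x) = 0.7 x`, and a harmonic potential. It evaluates $\int P\,[\tfrac{1}{2m} S'^2 + \tfrac{\hbar^2}{8m}(P'/P)^2 + V]\,dx$ and $\int [\tfrac{\hbar^2}{2m}|\psi'|^2 + V|\psi|^2]\,dx$ with $\psi = \sqrt{P}\exp(iS/\hbar)$.

```python
import cmath
import math

HBAR = 1.0   # assumed units, for illustration only
MASS = 1.0

def P(x):
    """Illustrative density: standard Gaussian."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def S(x):
    """Illustrative phase function (linear in x)."""
    return 0.7 * x

def V(x):
    """Illustrative potential (harmonic)."""
    return 0.5 * x * x

def psi(x):
    """Madelung transformation: psi = sqrt(P) exp(i S / hbar)."""
    return math.sqrt(P(x)) * cmath.exp(1j * S(x) / HBAR)

def deriv(f, x, h=1e-4):
    """Central-difference derivative; works for real or complex-valued f."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def integrate(f, a=-10.0, b=10.0, n=20000):
    """Composite trapezoidal rule on [a, b]."""
    step = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * step)
    return total * step

def hamiltonian_hydro():
    """Hamiltonian written in the hydrodynamical variables P and S."""
    def integrand(x):
        p = P(x)
        kinetic = deriv(S, x) ** 2 / (2.0 * MASS)
        fisher = (HBAR ** 2 / 4.0) * (deriv(P, x) / p) ** 2 / (2.0 * MASS)
        return p * (kinetic + fisher + V(x))
    return integrate(integrand)

def hamiltonian_wave():
    """Hamiltonian written in terms of psi and its complex conjugate."""
    def integrand(x):
        dpsi = deriv(psi, x)
        return HBAR ** 2 / (2.0 * MASS) * abs(dpsi) ** 2 + V(x) * abs(psi(x)) ** 2
    return integrate(integrand)

if __name__ == "__main__":
    print(hamiltonian_hydro(), hamiltonian_wave())
```

For these particular choices both integrals can also be done by hand, giving 0.245 + 0.125 + 0.5 = 0.87, so the two numerical values should agree with each other and with that number up to discretization error.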

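Finally, one introduces a Hilbert space structure using $g_{ab}$, $\Omega_{ab}$ to define the Dirac
product. For two wave functions $\phi$, $\psi$ define the Dirac product by

$$
\langle \phi | \psi \rangle = \frac{1}{2\hbar} \int \Big\{ (\phi, \phi^*) \cdot \big[ g + i\Omega \big] \cdot (\psi, \psi^*)^{T} \Big\} \, d^n x
$$

$$
= \frac{1}{2\hbar} \int (\phi, \phi^*)
\left[ \begin{pmatrix} 0 & \hbar \\ \hbar & 0 \end{pmatrix} + i \begin{pmatrix} 0 & i\hbar \\ -i\hbar & 0 \end{pmatrix} \right]
\begin{pmatrix} \psi \\ \psi^* \end{pmatrix} d^n x
= \int \phi^* \psi \, d^n x .
$$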



In this way the Hilbert space structure of quantum mechanics results from combining the 
symplectic structure of the hydrodynamical model with the Fisher information metric of 
information theory. 
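The way the metric and the symplectic form combine into the usual inner product can be checked at a single point with complex arithmetic. This is only a sketch under an assumption about ordering: the field vectors are taken as $(\phi, \phi^*)$ and $(\psi, \psi^*)$, the canonical matrices are $g = \begin{pmatrix} 0 & \hbar \\ \hbar & 0 \end{pmatrix}$ and $\Omega = \begin{pmatrix} 0 & i\hbar \\ -i\hbar & 0 \end{pmatrix}$, and the function name is hypothetical.

```python
def dirac_integrand(phi, psi, hbar=1.0):
    """Pointwise value of (1 / 2 hbar) (phi, phi*) [g + i Omega] (psi, psi*)^T."""
    g = [[0.0, hbar], [hbar, 0.0]]                    # canonical flat Kaehler metric
    omega = [[0.0, 1j * hbar], [-1j * hbar, 0.0]]     # canonical symplectic form
    combined = [[g[i][j] + 1j * omega[i][j] for j in range(2)] for i in range(2)]
    left = [phi, phi.conjugate()]
    right = [psi, psi.conjugate()]
    total = sum(left[i] * combined[i][j] * right[j] for i in range(2) for j in range(2))
    return total / (2.0 * hbar)

if __name__ == "__main__":
    phi = 1.0 + 2.0j
    psi = 0.5 - 1.5j
    print(dirac_integrand(phi, psi, hbar=0.7))
    print(phi.conjugate() * psi)
```

The combination $g + i\Omega$ has a single non-zero entry, $2\hbar$, so the integrand reduces to $\phi^* \psi$ for any value of `hbar`; the two printed numbers should coincide.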

An important result that comes out of this analysis concerns the issue of suitable boundary
conditions for the fields $P$ and $S$. It has been pointed out [21] that the Schrödinger
theory is not strictly equivalent to some of the other formulations (i.e., the hydrodynamical
formulation and stochastic mechanics) because features such as the quantization of angular
momentum, which are natural when the theory is formulated in terms of wave functions,
require an additional constraint in a theory formulated in terms of hydrodynamical variables.
For example, in the case of the hydrogen atom, the quantization of angular momentum
results from requiring that the wave function be single-valued in configuration space. But
the derivation of the Kähler structure and Hilbert space structure presented here shows
that the Schrödinger representation follows naturally from the hydrodynamical formulation
provided we take into account the role of the Fisher information metric, and furthermore
that this representation is unique in that it is the coordinate system in which the Kähler
structure takes the simplest form. From a purely mathematical point of view, it is not
surprising that the correct boundary conditions are those that are simplest when formulated
in the simplest coordinate system, i.e. single-valuedness of the canonically conjugate fields
$\psi$, $\psi^*$.



VI. APPENDIX 

I want to examine the extremum obtained from the fixed end-point variation of the
Lagrangian $L_{QM}$, equation (8). In particular, I wish to show the following: given $P$ and $S$
that satisfy equations (6) and (9), a small variation of the probability density $P(x^\mu, t) \to
P(x^\mu, t)' = P(x^\mu, t) + \epsilon \, \delta P(x^\mu, t)$ for fixed $S$ will lead to an increase in $L_{QM}$, as well as an
increase in the Fisher information $I$.

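I assume fixed end-point variations, and variations $\epsilon \, \delta P$ that are well defined in the sense
that $P'$ will have the usual properties required of a probability density (such as $P' \geq 0$ and
normalization).

Let $P \to P' = P + \epsilon \, \delta P$. Since $P$ and $S$ are solutions of the variational problem, the
terms linear in $\epsilon$ vanish. If one keeps terms up to order $\epsilon^2$, the change in $L_{QM}$ is given by

$$
\Delta L_{QM} = L_{QM}(P', S) - L_{QM}(P, S)
$$

$$
= \frac{\lambda \epsilon^2}{2} \int g^{\mu\nu} \left\{ \frac{(\delta P)^2}{P^3} \frac{\partial P}{\partial x^\mu} \frac{\partial P}{\partial x^\nu} - \frac{2 (\delta P)}{P^2} \frac{\partial P}{\partial x^\mu} \frac{\partial (\delta P)}{\partial x^\nu} + \frac{1}{P} \frac{\partial (\delta P)}{\partial x^\mu} \frac{\partial (\delta P)}{\partial x^\nu} \right\} dt \, d^n x + O(\epsilon^3) .
$$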

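Using the relation

$$
P \, g^{\mu\nu} \frac{\partial}{\partial x^\mu} \left( \frac{\delta P}{P} \right) \frac{\partial}{\partial x^\nu} \left( \frac{\delta P}{P} \right)
= g^{\mu\nu} \left\{ \frac{(\delta P)^2}{P^3} \frac{\partial P}{\partial x^\mu} \frac{\partial P}{\partial x^\nu} - \frac{2 (\delta P)}{P^2} \frac{\partial P}{\partial x^\mu} \frac{\partial (\delta P)}{\partial x^\nu} + \frac{1}{P} \frac{\partial (\delta P)}{\partial x^\mu} \frac{\partial (\delta P)}{\partial x^\nu} \right\} ,
$$

one can write $\Delta L_{QM}$ as

$$
\Delta L_{QM} = \frac{\lambda \epsilon^2}{2} \int P \, g^{\mu\nu} \frac{\partial}{\partial x^\mu} \left( \frac{\delta P}{P} \right) \frac{\partial}{\partial x^\nu} \left( \frac{\delta P}{P} \right) dt \, d^n x + O(\epsilon^3) ,
$$

which shows that $\Delta L_{QM} > 0$ for small variations, and therefore that the extremum of
$L_{QM}$ is a minimum. Furthermore, since $\Delta L_{QM} \sim \lambda$, it is the Fisher information term $I$
in the Lagrangian $L_{QM}$ that increases, and the extremum is also a minimum of the Fisher
information.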
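The positivity of the second-order term can be illustrated numerically for the Fisher information functional alone. The sketch below is an illustration under assumptions that are not in the text: one dimension with $m = 1$, a standard Gaussian for $P$, and the odd perturbation $\delta P = x\,e^{-x^2}$ (which integrates to zero, so normalization is preserved). It compares a symmetric finite-difference estimate of the $\epsilon^2$ coefficient of $I[P + \epsilon\,\delta P]$, with $I[P] = \frac{1}{2}\int (P')^2/P \, dx$, against the closed form $\frac{1}{2}\int P\,[(\delta P/P)']^2\,dx$.

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)

def p(x):
    """Unperturbed density: standard Gaussian (illustrative choice)."""
    return math.exp(-0.5 * x * x) / SQRT_2PI

def dp(x):
    """Derivative of the unperturbed density."""
    return -x * p(x)

def q(x):
    """Perturbation delta P: odd in x, so it integrates to zero."""
    return x * math.exp(-x * x)

def dq(x):
    """Derivative of the perturbation."""
    return (1.0 - 2.0 * x * x) * math.exp(-x * x)

def integrate(f, a=-8.0, b=8.0, n=16000):
    """Composite trapezoidal rule on [a, b]."""
    step = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * step)
    return total * step

def fisher(eps):
    """I[P + eps * deltaP] = (1/2) int (P')^2 / P dx, in one dimension with m = 1."""
    def integrand(x):
        return 0.5 * (dp(x) + eps * dq(x)) ** 2 / (p(x) + eps * q(x))
    return integrate(integrand)

def second_variation_numeric(eps=1e-3):
    """Coefficient of eps^2, estimated from a symmetric second difference in eps."""
    return (fisher(eps) + fisher(-eps) - 2.0 * fisher(0.0)) / (2.0 * eps * eps)

def second_variation_formula():
    """(1/2) int P [d/dx (deltaP / P)]^2 dx, which is manifestly non-negative."""
    def integrand(x):
        ratio_prime = dq(x) / p(x) - q(x) * dp(x) / p(x) ** 2
        return 0.5 * p(x) * ratio_prime ** 2
    return integrate(integrand)

if __name__ == "__main__":
    print(second_variation_numeric(), second_variation_formula())
```

The two values should agree up to terms of order $\epsilon^2$, and both are positive, consistent with the statement that the Fisher information term increases under small variations of $P$.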